Statistical Machine Translation as a Grammar Checker for Persian Language

نویسندگان

  • Nava Ehsan
  • Heshaam Faili
چکیده

Existence of automatic writing assistance tools such as spell and grammar checker/corrector can help in increasing electronic texts with higher quality by removing noises and cleaning the sentences. Different kinds of errors in a text can be categorized into spelling, grammatical and real-word errors. In this article, the concepts of an automatic grammar checker for Persian (Farsi) language, is explained. A statistical grammar checker based on phrasal statistical machine translation (SMT) framework is proposed and a hybrid model is suggested by merging it with an existing rule-based grammar checker. The results indicate that these two approaches are complimentary in detecting and correcting syntactic errors, although statistical approach is able to correct more probable errors. The state-ofthe-art results on Persian grammar checking are achieved by using the hybrid model. The obtained recall is about 0.5 for correction and about 0.57 for detection with precision about 0.63.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...

متن کامل

Chunk-based Grammar Checker for Detection Translated English Sentences

Machine Translation systems expect target language output to be grammatically correct within the frame of proper grammatical category. In Myanmar-English statistical machine translation system, the target language output (English) can often be ungrammatical. To solve this need, we propose an ongoing chunk-based grammar checker for translated English sentences. Most of the typical grammar checke...

متن کامل

Developing a Chunk-based Grammar Checker for Translated English Sentences

Machine Translation systems expect target language output to be grammatically correct. In Myanmar-English statistical machine translation system, target language output (English) can often be ungrammatical. To address this issue, we propose an ongoing chunk-based grammar checker by using trigram language model and rule based model. It is able to solve distortion, deficiency and make smooth the ...

متن کامل

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

متن کامل

The Relationship between EFL Learners’ Explicit Knowledge of Source Language and Their Translation Ability

The purpose of this study was to investigate the relationship between students‘ explicit knowledge in grammar and their translation ability. The importance of grammatical knowledge and its effectiveness in translation quality motivated the researcher to run this study and consider grammatical knowledge in Per- sian as the source language of Iranian students. It is clear that grammar is an area ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011